Method and apparatus for filtering E-mail

ABSTRACT

A server is disclosed for filtering e-mail messages. The server receives requests to retrieve e-mail messages on behalf of a client and then retrieves e-mail messages from a mail server on behalf of the client. The server then filters the e-mail messages based on one or more rules and transfers the filtered e-mail messages to the client. In addition, the server continues to filter the e-mail messages after the client has disconnected from the server. In one embodiment of the invention the e-mail message recipient is sent a notification by the server indicating that messages have been filtered. The recipient is then able to scan the filtered messages and ensure that the messages have been filtered correctly. In another embodiment, a third party scans the e-mail messages on behalf of the e-mail user to make this determination. Also disclosed is an e-mail filter comprising an application programming interface and a plurality of dynamically loaded rule modules adapted to interface with the API. The rule modules are activated and deactivated based on usage. Specifically, rule modules which have not been used for a predetermined period of time are deactivated. In addition, different rule modules are assigned different weighted values based on the probability that the rule module will accurately filter e-mail messages and/or on the content of the e-mail messages.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to the filtering of electronic mail (hereinafter “e-mail”). More particularly, the present invention relates to a method and apparatus by which dynamic filtering techniques are applied to filter out unwanted e-mail messages at various stages of transmission across one or more networks.

2. Description of the Related Art

The rapid increase in the number of users of e-mail and the low cost of distributing electronic messages via the Internet and other electronic communications networks have made the Internet an attractive medium for productivity, workflow, advertising, and business and personal communication. Consequently, e-mail and Internet newsgroups are now frequently used as the medium for widespread marketing broadcasts. These unwanted messages are commonly referred to as “spam.”

Spam is more than just an annoyance to Internet users; it represents a significant threat to the stability of vast numbers of computers and networks which comprise the Internet community. Internet service providers (hereinafter “ISPs”), online services, and corporations spend millions of dollars each year attempting to control spam. In fact, some spam distributions are so large in scope that they have the ability to crash large ISP and corporate servers. One of the reasons why spam is so pervasive is that spammers require only a computer, an address list and Internet access to distribute spam to potentially millions of Internet users. In sum, if not properly controlled, spam is capable of disabling significant portions of the Internet.

There are a number of known methods for filtering spam, including Realtime Blackhole List (“RBL”) filtering, the Open Relay Blocking System (“ORBS”), and Procmail rules and recipes. Frequently, these methods are designed to block spam from particular e-mail addresses from which spam is known to originate. For example, filtering methods used by America OnLine® and Prodigy® use exclusion filters which block e-mail messages received from addresses that are suspected sources of spam. However, this approach is vulnerable to rapid changes in the source of unsolicited e-mail. Furthermore, because online services will generally not automatically block e-mail addresses from their members, these services are provided only if the user requests them.

Another known e-mail filtering technique is based upon an inclusion list, such that e-mail received from any source other than one listed in the inclusion list is discarded as junk. However, this method requires the user or the service provider to continually update the inclusion list manually because, like viruses, spam is constantly being modified to bypass static filters. If the inclusion list is not updated regularly, the list will quickly become outdated, resulting in the exclusion of desired e-mail messages from new sources and the continued inclusion of spam from old sources.

The Assignee of the present invention has developed improved techniques for filtering e-mail. Some of these techniques are described in related U.S. patent applications entitled UNSOLICITED E-MAIL ELIMINATOR, U.S. Pat. No. 5,999,992, and APPARATUS AND METHOD FOR CONTROLLING DELIVERY OF UNSOLICITED ELECTRONIC MAIL, U.S. Pat. No. 6,052,709.

The present application sets forth additional techniques for filtering e-mail. What is needed is an improved system and method for dynamically updating e-mail filter technology to meet the threat posed by ever-changing varieties of spam. In addition, what is needed is a system and method for filtering e-mail which can be easily implemented by the typical e-mail user. What is also needed is an anti-spam system and method which can be applied without the need to upgrade computer systems and networks currently available.

SUMMARY OF THE INVENTION

Disclosed is a server having a processor and a memory coupled to the processor, the memory having stored therein sequences of instructions which, when executed by the processor, cause the processor to perform the steps of: (1) receiving a request to retrieve e-mail messages on behalf of a client; (2) retrieving one or more e-mail messages from a mail server on behalf of the client; and (3) filtering the e-mail messages based on one or more rules to produce one or more filtered e-mail messages.

Also disclosed is a first server having a processor and a memory coupled to the processor, the memory having stored therein sequences of instructions which, when executed by the processor, cause the processor to perform the steps of: retrieving messages from a second server on behalf of a client; sorting messages into two or more groups based on one or more rules; and forwarding messages sorted into one of the groups to the client.

Also disclosed is an e-mail filter comprising an application programming interface (“API”) and a plurality of rule handling filter modules adapted to interface with the API.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be obtained from the following detailed description in conjunction with the following drawings, in which:

FIG. 1 illustrates generally a data network through which two clients and a server communicate.

FIG. 2 illustrates the data network of FIG. 1 wherein an embodiment of the server of FIG. 1 includes an e-mail filter.

FIG. 3 illustrates the data network of FIG. 1 including a proxy server used to filter e-mail.

FIG. 4 is a signal diagram illustrating communication between a mail server and a client.

FIG. 5 is a signal diagram illustrating communication between a client, a proxy server, and a mail server.

FIG. 6 is a signal diagram illustrating communication between a client, a proxy server, and a mail server.

FIG. 7 is one embodiment of an e-mail filter.

FIG. 8 is one embodiment showing two or more user accounts established on a server or a proxy server.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 generally depicts a network 100 over which client 110, client 120, and mail server 130 communicate. Clients 110, 120 and mail server 130 are computers, each comprising a conventional processor and a memory with which software implementing the functionality of the present invention is executed. In one embodiment, network 100 is the Internet and clients 110, 120 and mail server 130 communicate using the well-known TCP/IP protocol. In this embodiment, one or more of the clients 110, 120 may use a modem to dial out over a standard telephone line to establish a communication channel with mail server 130 over network 100. Alternatively, clients 110, 120 or mail server 130 may connect to network 100 using a digital T1 carrier or an ISDN channel. In yet another embodiment, network 100 is a local area network (hereinafter “LAN”) over which clients 110, 120 and mail server 130 communicate.

Depending on the system configuration, however, one of ordinary skill in the art will readily recognize from the following discussion that different types of mail servers, clients and software could be employed without departing from the underlying principles and scope of the present invention. Accordingly, while the embodiment discussed below uses an Internet communication channel for communication between clients 110, 120 and mail server 130, numerous other communication schemes such as a direct connection to mail server 130, etc., could be implemented as well.

The following is a general description of how client 120 sends e-mail to client 110. Each client 110, 120 is capable of executing e-mail application programs, generally illustrated in FIG. 1 as user agent 115 and user agent 125. User agents 115, 125 are capable of accepting commands for composing, receiving, and replying to e-mail messages. E-mail user agents known in the art include Microsoft Outlook®, Lotus cc:mail®, Lotus Notes®, Eudora®, Novell Groupwise® and Netscape Communicator®. To send an e-mail to client 110, client 120 provides its user agent 125 with a message and a destination address. The destination address uniquely identifies client 110's mailbox 140 on mail server 130. Server 130 in this embodiment provides a message transport system (hereinafter “MTS”) for receiving and storing incoming e-mail messages in mailbox 140. E-mail is sent over the Internet from client 120 to mail server 130 via a TCP connection to port 25. At the application level, the protocol used to send e-mail is usually the Simple Mail Transfer Protocol (hereinafter “SMTP”).

If client 110 is connected to the same LAN as mail server 130, client 110's user agent 115 may continually check mailbox 140 (e.g., every 10 minutes) for new e-mail messages. Alternatively, client 110 may need to connect to mail server 130 by dialing out over a telephone line using a modem (i.e., either by dialing directly to mail server 130 or dialing out and connecting to mail server 130 over the Internet). This would typically be the case for a home computer user who has an account with an Internet Service Provider or online service, or a user who must dial out to connect to his corporate LAN.

For the purposes of the following discussion, when client 110 dials out to connect to mail server 130 to check mailbox 140, it communicates with mail server 130 using “Post Office Protocol 3” (hereinafter “POP3”). It should be noted, however, that different standard and proprietary mail protocols such as the Interactive Mail Access Protocol (hereinafter “IMAP”), the Distributed Mail System Protocol (hereinafter “DMSP”), X.400, and Lotus Notes may be implemented without departing from the scope of the present invention.

FIG. 4 is a signal diagram which illustrates the interaction between client 110 and mail server 130 using the POP3 protocol. In the authorization state, user agent 115 sends mail server 130 a user name and password (at 400 and 410, respectively). Based on the user name and password, mail server 130 determines whether client 110 should be provided access to mailbox 140. If access is permitted (i.e., the submitted user name and password are correct) mail server 130 responds with a positive status indicator 415. The session then enters the transaction state.

In the transaction state user agent 115 sends mail server 130 a LIST command at 420. If there is new mail in mailbox 140, server 130 responds to the LIST command with the number of new messages (100 in the example) at 425. User agent 115 then retrieves the e-mail messages one at a time using a RETR command (starting at 430). Finally, after user agent 115 has received all 100 messages, it issues the QUIT command at 460 and the POP3 connection is terminated at 465. This final state is referred to as the update state. During the update state mail server 130 updates mailbox 140 by removing all messages marked as deleted and releases its lock on mailbox 140.
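
By way of illustration only, the three POP3 session states described above can be modeled with a small in-memory state machine. The sketch below performs no networking, and its reply strings are simplified relative to the actual protocol. The class name Pop3Session and its fields are hypothetical and do not appear in the figures. The DELE command, which marks a message as deleted, is part of standard POP3 although it is not shown in FIG. 4.

```python
from dataclasses import dataclass, field


@dataclass
class Pop3Session:
    """Toy in-memory model of a POP3 session (hypothetical; no networking)."""
    username: str
    password: str
    mailbox: list                      # message bodies; index 0 is message 1
    state: str = "AUTHORIZATION"
    claimed_user: str = ""
    deleted: set = field(default_factory=set)

    def _valid(self, n: int) -> bool:
        return 1 <= n <= len(self.mailbox) and n not in self.deleted

    def handle(self, line: str) -> str:
        verb, _, arg = line.partition(" ")
        verb = verb.upper()
        if self.state == "AUTHORIZATION":
            if verb == "USER":
                self.claimed_user = arg
                return "+OK"
            if verb == "PASS":
                if self.claimed_user == self.username and arg == self.password:
                    self.state = "TRANSACTION"
                    return "+OK mailbox locked and ready"
                return "-ERR invalid credentials"
        elif self.state == "TRANSACTION":
            if verb == "LIST":
                count = len(self.mailbox) - len(self.deleted)
                return f"+OK {count} messages"   # simplified reply format
            if verb == "RETR" and arg.isdigit():
                n = int(arg)
                if self._valid(n):
                    return "+OK\r\n" + self.mailbox[n - 1]
                return "-ERR no such message"
            if verb == "DELE" and arg.isdigit():
                n = int(arg)
                if self._valid(n):
                    self.deleted.add(n)          # only marked; purged at QUIT
                    return "+OK"
                return "-ERR no such message"
            if verb == "QUIT":
                # Update state: purge messages marked as deleted.
                self.mailbox[:] = [m for i, m in enumerate(self.mailbox, 1)
                                   if i not in self.deleted]
                self.deleted.clear()
                self.state = "UPDATE"
                return "+OK bye"
        return "-ERR command not valid in this state"
```

In this sketch a LIST command issued before authentication is rejected, and messages marked with DELE remain in the mailbox until QUIT moves the session into the update state.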

If client 120 previously sent an e-mail message to client 110 as described above, client 120's e-mail will be one of the 100 messages retrieved from mail server 130. However, if client 120 is a spammer, sending client 110 an unwanted solicitation, client 110 will have wasted hard drive space, line access time/cost and personal time downloading the e-mail message. This inefficiency and waste becomes more significant if a substantial percentage of all 100 e-mail messages stored in mailbox 140 are spam. Moreover, if client 110 is communicating with mail server 130 through a modem dial-up connection (rather than over a LAN), the time and cost associated with downloading unwanted content is even more significant given the current speed limitations of modems (currently less than 56,000 bits/sec) and the access cost of ISPs. Accordingly, it would be beneficial to provide an e-mail filter to remove spam from mailbox 140 before time is wasted downloading and storing it.

FIG. 2 illustrates one method for solving this problem. Mail server 130 of FIG. 2 includes an e-mail filter module 220, a plurality of anti-spam rules 210, mailbox 140, and a spam storage area 230. All incoming e-mail initially passes through filter module 220. Filter module 220 applies a set of rules 210 for detecting spam. Spam is then deposited in a spam storage area 230 while legitimate e-mail is sent through to mailbox 140. In an alternative embodiment, spam is initially stored in a mailbox and is subsequently filtered using filter module 220.

As set forth in U.S. Pat. No. 6,052,709 entitled “APPARATUS AND METHOD FOR CONTROLLING DELIVERY OF UNSOLICITED E-MAIL”, herein incorporated by reference, rules 210 applied to filter module 220 may be established based on information collected using “spam probes.” A spam probe is an e-mail address selected to make its way onto as many spam mailing lists as possible. It is also selected to appear high up on spammers' lists in order to receive spam mailings early in the mailing process (e.g., using the e-mail address “aardvark@aol.com” ensures relatively high placement on an alphabetical mailing list).

Spam collected from various spam probes is then analyzed and rules 210 are established based on this analysis. For example, in one embodiment, source header data from incoming e-mail is analyzed. If the network address contained in the source header is identified as the network address of a known spammer, a rule will be established to filter all incoming e-mail from this network address into the spam storage area 230. Rules 210 may also be established which identify spam based on a mathematical signature (e.g., a checksum) of the spam e-mail body (or portions of the e-mail body). Any incoming message which contains the identified signature will subsequently be forwarded into the spam storage area 230. Rules 210 based on keywords in the subject or body of spam e-mail may also be established. For example, all e-mails containing the two words “sex” and “free” may be identified as spam and filtered. In one embodiment, mail marked as potential spam is transferred to a control center where it is inspected (e.g., by a computer technician) before being sent to the spam storage area 230.
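
The three kinds of rules just described (source address, mathematical signature, and keyword) can be sketched as simple predicates, by way of non-limiting example. The function names, the dictionary layout of a message, the use of SHA-256 as the checksum, and the whitespace and case normalization are all assumptions made for illustration. Keyword matching here is a plain substring test, which a practical rule would likely refine.

```python
import hashlib


def make_source_rule(blocked_addresses):
    """Rule that fires when the source network address is a known spammer's."""
    blocked = set(blocked_addresses)
    return lambda msg: msg["source_ip"] in blocked


def body_signature(body):
    """Checksum of the body after collapsing whitespace and case (assumed)."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def make_signature_rule(known_signatures):
    """Rule that fires when the body checksum matches a known spam signature."""
    known = set(known_signatures)
    return lambda msg: body_signature(msg["body"]) in known


def make_keyword_rule(keywords):
    """Rule that fires only when every keyword occurs in subject or body."""
    words = [k.lower() for k in keywords]

    def rule(msg):
        text = (msg["subject"] + " " + msg["body"]).lower()
        return all(w in text for w in words)

    return rule


def classify(msg, rules):
    """Route a message to the spam storage area if any rule fires."""
    return "spam" if any(rule(msg) for rule in rules) else "mailbox"
```

A message that mentions only “free” is passed to the mailbox under the two-keyword rule, since that rule requires both words to be present.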

In addition, a dynamically updated inclusion list may be generated at server 130. The inclusion list initially screens e-mail header fields of all incoming e-mail messages. All messages from sources that appear on the list are passed through to mailbox 140. E-mail received from sources other than those on the inclusion list is processed by additional anti-spam rules 210 as described herein. In one embodiment, the inclusion list is simply one of the plurality of rules 210 acting on filter module 220. In another embodiment, the inclusion list is a separate software module executing on server 130. The end result is the same regardless of which embodiment is used.
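
As a minimal sketch of this screening order, the inclusion list can be consulted first, with the remaining anti-spam rules applied only to senders that are not listed. The function name route, the "from" field, and the lower-case normalization of addresses are hypothetical choices made for this example.

```python
def route(message, inclusion_list, spam_rules):
    """Screen by inclusion list first, then fall back to the anti-spam rules.

    inclusion_list is assumed to be a set of lower-case sender addresses.
    """
    sender = message["from"].strip().lower()
    if sender in inclusion_list:
        return "mailbox"              # listed senders bypass the other rules
    if any(rule(message) for rule in spam_rules):
        return "spam"
    return "mailbox"
```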

E-mail Filter API

Because spam is continuously being created and modified by spammers, a system is needed in which rules for filtering spam can be easily updated. In addition, an e-mail filtering system is needed which is easily modified to suit the unique preferences of individual e-mail users and which can be easily adapted to run on proprietary e-mail systems. To address these and other issues an improved filter module 700 is illustrated in FIG. 7. The improved filter module 700 includes an anti-spam application programming interface (hereinafter “API”) 710 which runs on a message transfer agent 705 or other components of an e-mail transmission system including a POP3 server. As is known in the art, an API includes a plurality of subroutines which can be invoked by application software (i.e., software written to operate in conjunction with the particular API). Thus, in FIG. 7 rule handling filter modules 720, 730, 740, 750, and 760 are dynamically loaded into memory to meet the needs of new filtering or e-mail processing methods. The modules interface with API 710 by making calls to the API's set of predefined subroutines. In one embodiment, a portion of the API 710 subroutines and a set of prefabricated rule handling filter modules can be marketed as a Software Development Kit (hereinafter “SDK”). This will allow ISPs, corporations and/or end-users to customize the type of e-mail filtering which they require. In addition, because modules 720-760 are dynamically linked, they may be loaded and unloaded without having to shut down the application or reboot the system on which the filter module is executed.

Referring again to FIG. 7, in one embodiment, each one of the rule handling filter modules 720-760 filters spam based on a different criterion. For example, filter module RS(A) 720 may filter spam based on a specific keyword search, whereas filter module RS(B) 730 may filter spam based on a mathematical signature (e.g., a checksum), RS(E) 760 may perform a virus check on incoming e-mail messages, and RS(C) 740 may be an inclusion list. Other contemplated rule handling filter modules will filter e-mail based on: (1) word or letter frequency analysis; (2) IP source frequency analysis; (3) misspelling analysis (unwanted e-mail often contains misspelled words); (4) word or letter combination analysis; (5) technical or legal RFC822 header compliance; and (6) feature extraction and analysis (e.g., based on phone numbers, URLs, addresses, etc.). It should be noted that all of the rule handling filter modules described herein may be combined or applied over a distributed array of filters throughout a network.

Referring again to FIG. 7, if a spammer makes a slight modification to his message in order to circumvent, for example, the mathematical signature module RS(B) 730, the module can be updated to include the new mathematical signature of the spammer's modified message without affecting the remaining modules. Thus, an improved filter module 700 is described which allows continuous, user-specific modifications to a plurality of rule modules 720-760.
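
For illustration, the relationship between API 710 and independently replaceable rule modules can be sketched as a registry of named handlers. The class FilterAPI and its methods are hypothetical and are not the actual subroutines of API 710. In-process registration stands in for the dynamic linking described above.

```python
class FilterAPI:
    """Hypothetical registry through which rule modules plug into a filter."""

    def __init__(self):
        self._modules = {}            # module name -> callable(message) -> bool

    def load_module(self, name, handler):
        """Load a module, or replace one of the same name, while running."""
        self._modules[name] = handler

    def unload_module(self, name):
        """Unload a module without disturbing the others."""
        self._modules.pop(name, None)

    def loaded(self):
        return sorted(self._modules)

    def is_spam(self, message):
        """A message is treated as spam if any loaded module fires."""
        return any(handler(message) for handler in list(self._modules.values()))
```

In this sketch, replacing the handler registered under the name RS(B) with one that knows a new signature leaves the handler registered as RS(A) untouched, mirroring the update scenario described above.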

It should be noted that the rule-based filtering method and apparatus described herein can be implemented at virtually any point along the e-mail transmission path from client 120 to server 130 to client 110. For example, on mail server 130 an embodiment of filter module 700 can be located at a point in server 130 where e-mail messages are transmitted to client 110 or at a point where the messages are received (as shown in FIG. 2). In an alternative embodiment, mail server 130 initially stores all incoming messages in a message store and periodically applies filter 700 to all messages in the message store.

In addition, client 110 can itself contain an embodiment of filter module 700. Therefore, client 110 can apply filter module 700 periodically as described above (with reference to server 130) or, alternatively, can apply filter module 700 to all incoming e-mail messages. Filter module 700 can also be applied within a mail relay residing on network 100. In sum, filter module 700 may be applied at any node through which e-mail messages are transmitted.

Rule Aging and Weighting

In another embodiment, the rules 210 used by filter module 220 are continually monitored by server 130. If a rule has not been used to filter spam for a predetermined length of time (e.g., a month) that rule may be moved from an active to an inactive state and no longer applied to filter 220 (unless the type of spam for which it was created reappears). This type of rule-aging system is useful given the fact that older types of spam are continually replaced with new types. If not removed as described above, the number of outdated rules would build up to an unmanageable level and filter module 220 may become inefficient at removing spam (applying numerous obsolete rules to the incoming e-mail stream).
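
A minimal sketch of such rule aging follows, assuming each rule records the time at which it last fired. The class AgingRule, the functions age_rules and filter_message, and the 30-day default are hypothetical. How a deactivated rule is reactivated when its type of spam reappears is not modeled here.

```python
from datetime import datetime, timedelta


class AgingRule:
    """Hypothetical rule that remembers when it last identified spam."""

    def __init__(self, name, predicate, created):
        self.name = name
        self.predicate = predicate
        self.last_fired = created
        self.active = True


def filter_message(rules, message, now):
    """Apply only active rules; record usage for any rule that fires."""
    hit = False
    for rule in rules:
        if rule.active and rule.predicate(message):
            rule.last_fired = now
            hit = True
    return hit


def age_rules(rules, now, max_idle=timedelta(days=30)):
    """Move rules that have been idle longer than max_idle to inactive."""
    for rule in rules:
        if rule.active and now - rule.last_fired > max_idle:
            rule.active = False
```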

Additionally, in one embodiment, rules applied to filter module 220 are weighted based on one or more variables, including but not limited to spam content, probability of positive spam identification, and frequency of use. For example, a rule which is geared towards screening e-mail messages containing sexual content (e.g., in a home where children use the computer) which filters e-mail based on the keywords “sex” and “free” may be given a weight value of 10 on a scale from 1 to 10. However, a rule which screens e-mail based merely on the keyword “free” may be weighted with, e.g., a weight value of 2.

E-mail messages can also be weighted based on the probability that the filter has correctly identified the filtered e-mail message as spam. For example, a positive spam identification under a mathematical analysis (e.g., a checksum) will generally be accurate. Thus, an e-mail message identified as spam based on a mathematical calculation will be given a higher weighted value than, for example, a keyword identification.

In addition, the rules applied to filter module 220 may be additive such that it may take several rules to fire to allow filter module 220 to decide that a message is spam. In such an embodiment, the relative weights of the rules which identify the message as spam can be added together to establish a cumulative weighted value. Moreover, a filter module 220 in this embodiment may be configured based on how aggressively filter module 220 should screen e-mail messages. For example, filter module 220 may be configured to screen all e-mail messages with a cumulative weighted value of 6.
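
The additive scheme can be sketched as follows. The weights 10 and 2 come from the example above, while the other two rules and their weights are hypothetical. This sketch also assumes that the configured value of 6 is a minimum, so that a cumulative value of 6 or more is screened.

```python
def cumulative_weight(message, weighted_rules):
    """Sum the weights of every rule that fires against the message."""
    return sum(weight for predicate, weight in weighted_rules
               if predicate(message))


def is_spam(message, weighted_rules, threshold=6):
    """Screen a message once its cumulative weight reaches the threshold."""
    return cumulative_weight(message, weighted_rules) >= threshold


EXAMPLE_RULES = [
    (lambda m: "sex" in m and "free" in m, 10),   # weight from the text
    (lambda m: "free" in m, 2),                   # weight from the text
    (lambda m: "click here" in m, 3),             # hypothetical rule
    (lambda m: "!!!" in m, 2),                    # hypothetical rule
]
```

Under these assumed weights, a message that trips only the “free” rule scores 2 and is delivered, while a message that trips the “free”, “click here” and “!!!” rules scores 7 and is screened.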

Rule Prioritization Based on Usage and Detection Accuracy

In one embodiment, rules have two weights associated with them: (1) a spam weight, which indicates the certainty that if a rule fires successfully against a message, the message is spam; and (2) a priority weight, which indicates the frequency of use and most recent use of a rule. In addition, in this embodiment two different priority weights may be associated with a rule, a global priority weight and a local priority weight.

Spam weight is an arbitrary weight chosen by a rule designer based on internal heuristics to reflect the certainty that the rule will correctly identify spam. General rules, for example, that don't definitively identify spam but are designed to detect typical spammer practices, may be provided lower weights than rules which target a specific type of spam. Thus, because of their low assigned weight, general rules may need to fire along with other rules (general or not) for spam to be filtered by filter module 700 (or 310, 850).

In this embodiment, new rules are assigned the highest priority weight. As statistics on rule usage are compiled, the priority weight may change to reflect how often the rule is used. For example, if a new rule is generated and isn't used within a configured period of time, the rule's priority weight will decrease. Similarly, if a rule has been used recently or frequently within a designated interval, the priority weight will be increased. Global priority weight reflects a rule's usage at all known filter modules across a network (e.g., network 100) whereas local priority weight reflects a rule's usage at a specific node (e.g., filter module 310).

The spam weight and priority weight of a set of rules may be combined mathematically to prioritize rules within the set. In one embodiment, the spam weight and priority weight of each rule within the set are multiplied together and the results of the multiplications are ordered sequentially. Those rules associated with numerically higher results (i.e., which are used more frequently and are more likely to correctly identify spam) are assigned higher priorities within the set.

Moreover, rules assigned higher priorities will tend to be executed earlier against messages applied to the filter module 310, resulting in a more efficient spam filtration system. In other words, spam will be identified more quickly because rules with a higher usage frequency and detection accuracy will be implemented first. In addition, both the global and local priority weights may be multiplied with a rule's spam weight as described above to evaluate new orderings within the set of rules.
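
The multiplicative ordering can be sketched in a few lines. The Rule record and the numeric weights shown in the usage note are hypothetical, and a single priority weight stands in for either the global or the local value.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    spam_weight: float        # certainty that a hit means spam
    priority_weight: float    # reflects frequency and recency of use


def prioritize(rules):
    """Order rules by spam weight times priority weight, highest first."""
    return sorted(rules,
                  key=lambda r: r.spam_weight * r.priority_weight,
                  reverse=True)
```

For instance, with assumed weights of (2, 9) for a general rule, (9, 5) for a signature rule and (8, 1) for a rarely used rule, the products are 18, 45 and 8, so the signature rule would be executed first and the rarely used rule last.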

Proxy Server E-mail Filter

One problem associated with the implementation of filter modules 220 and 700 as described above is that in order for client 110 to receive the benefit of a server-side filter module 220, it must be set up and maintained by the administrator of server 130. For example, if the user of client 110 belongs to an online service (for example, Netcom®), that user will not be able to implement a server-side filter if Netcom does not provide one. Moreover, assuming arguendo that Netcom offers a server-side e-mail filter, the user of client 110 will be limited to the type of filter offered. If the filter is merely a static filter, the user will not be able to significantly tailor it to his specific preferences.

As described above, user agent 115 executed on client 110 allows client 110 to check for e-mail in mailbox 140 on mail server 130. E-mail user agents known in the art include Microsoft Outlook®, Lotus cc:mail®, Lotus Notes®, Novell Groupwise®, Eudora®, and Netscape Communicator®. In order for any one of these user agents to retrieve mail from mailbox 140 on mail server 130, the user agent must be configured with the correct network address (e.g., “mail.isp.com” as shown in FIG. 3) and the correct mail protocol (e.g., POP3 in our example). If configured properly, user agent 115 will initially send a user name and password to open communication with mail server 130 “mail.isp.com” during the POP3 authentication stage (as described in detail above with reference to FIG. 4).

FIG. 3 illustrates one embodiment of the present system and method in which a proxy server 300 is used to retrieve and filter e-mail initially stored in mailbox 140. Proxy server 300 is comprised of a mail storage module 330, a filter module 310, a spam storage area 340 and a plurality of rules 320. In this embodiment, user agent 115 can be reconfigured so that e-mail is no longer received directly from mailbox 140. Rather, e-mail will first be transferred through filter module 310 in proxy server 300 to be filtered before being transferred to user agent 115. In one embodiment, this is accomplished by changing the mail server address listed in user agent 115 from “mail.isp.com” to the network address of the proxy server, “mail.proxy.com.”

Thus, referring now to FIG. 5 as well as FIG. 3, when user agent 115 is executed, it initially sends its user name to “mail.proxy.com,” rather than “mail.isp.com” where its mailbox resides (signal 505). Proxy server 300 then communicates the user name to mail server 130 on behalf of client 110 (signal 510). Once mail server 130 responds with a positive indication (signal 515) proxy server 300 passes on the response to client 110 (signal 520). Client 110 then transmits a password associated with the user name to proxy server 300. Once again, proxy server 300 forwards this information to mail server 130 on behalf of client 110 (signal 530) and forwards the response from mail server 130 (signal 535) on to client 110 (signal 540). In another embodiment, proxy server 300 requests both the password and the user name from client 110 before initiating contact with mail server 130.

Referring now to FIG. 6, client 110 sends a LIST command to proxy server 300 (signal 602), requesting a list of current e-mail messages, and proxy server 300 forwards the command to mail server 130 (signal 604) on behalf of client 110. In response to the list signal, mail server 130 sends a response to proxy server 300 that there are currently 100 messages stored in mailbox 140 (signal 606). Proxy server 300 then begins retrieving each of the 100 messages, starting with the RETR(1) signal 608 to retrieve message 1. Upon receiving message 1 (signal 610) proxy server 300 applies filter module 310 to the incoming message to determine whether message 1 is spam or legitimate e-mail. In one embodiment, filter module 310 also scans message 1 for computer viruses and processes message 1 accordingly. If message 1 is legitimate, it is stored in mail storage module 330 (to be subsequently transferred to client 110). However, if message 1 is spam, it is filtered into spam storage module 340. Similarly, if message 1 contains a computer virus, it can be filtered into a virus storage module (not shown).

In one embodiment, client 110 may send a request to proxy server 300 to view messages which have been filtered into spam storage module 340 (or into a virus storage module). Client 110 can then ensure that legitimate messages have not been inadvertently filtered by filter module 310. As such, in this embodiment, proxy server 300 will store spam in spam storage module 340 for a predetermined length of time. Alternatively, client 110 may be allocated a predetermined amount of memory in spam storage module 340. When this memory has been filled up with spam, the oldest spam messages will be forced out to make room for new spam messages. In addition, client 110 may be periodically notified that spam has been filtered into spam storage module 340. Once client 110 has checked spam storage area 340 to view messages which have been filtered, the user may choose to redirect one or more filtered messages back into mail storage module 330 (illustrated as signal 335 of FIG. 3). Once this decision has been made, the user may modify one or more of the rules 320 to ensure that this “type” of e-mail message is no longer filtered.
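
The fixed memory allocation with oldest-first eviction can be sketched with a double-ended queue. The class SpamStore, its byte-based capacity, and the release method (standing in for redirecting a message back to the mail storage module) are hypothetical.

```python
from collections import deque


class SpamStore:
    """Hypothetical per-client spam area with a fixed byte allocation."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._messages = deque()      # oldest message on the left

    def add(self, message: bytes):
        """Store a message, forcing out the oldest ones if space is short."""
        if len(message) > self.capacity:
            raise ValueError("message larger than the whole allocation")
        while self.used + len(message) > self.capacity:
            oldest = self._messages.popleft()
            self.used -= len(oldest)
        self._messages.append(message)
        self.used += len(message)

    def messages(self):
        """Return the stored messages, oldest first, for review."""
        return list(self._messages)

    def release(self, index):
        """Remove one message so it can be redirected to the mail store."""
        message = self._messages[index]
        del self._messages[index]
        self.used -= len(message)
        return message
```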

Proxy server 300 continues to retrieve messages from mail server 130 one at a time (signals 612 et seq.) until the last message (e.g., message 100, not illustrated in FIG. 6) has been retrieved. Filter module 310 applies its set of rules 320 to identify spam, computer viruses or other types of e-mail messages and sorts the retrieved messages accordingly. In the embodiment illustrated in FIG. 3 e-mail is filtered into either spam storage module 340 or mail storage module 330.

At some predetermined point in time (represented by the dashed line 624 in FIG. 6) client 110 will require a response from proxy server 300. In one embodiment, this time period 624 (hereinafter “timeout period”) is based on how long user agent 115 on client 110 will wait for a mail server response before timing out (and possibly issuing an error message to the user). In another embodiment, the timeout period 624 is determined by how long a typical user will tolerate waiting for an initial response from his/her mail server. Regardless of how the timeout period 624 is calculated, once it has been reached, proxy server 300 transfers all legitimate e-mail messages which have been processed (i.e., messages which have been identified as non-spam and virus-free by filter module 310 and transferred to mail storage module 330) to client 110.

In the signal diagram shown in FIG. 6, timeout period 624 is reached after proxy server 300 has processed 4 of the 100 e-mail messages from mail server 130. Of these four messages, messages 2 and 3 (signals 614 and 618) have been identified as spam and transferred to spam storage area 340. Messages 1 and 4 (signals 610 and 622), however, have been identified as legitimate. Thus, following timeout period 624, only legitimate messages 1 and 4 are presented to user agent 115 in response to user agent 115's LIST command (presented as messages 1 and 2). In response, client 110 retrieves messages 1 and 4 (identified by client 110 as messages 1 and 2) from mail storage module 330 on proxy server 300 (signals 627 through 633). Client 110 then sends the QUIT command (signal 635) and ends the e-mail retrieval session with proxy server 300.
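
The behavior at the timeout can be illustrated with a simulated clock, assuming for simplicity that each message takes a fixed time to retrieve and filter. The functions proxy_fetch and client_listing, the cost_per_message parameter, and the message names in the usage note are hypothetical.

```python
def proxy_fetch(messages, is_spam, timeout, cost_per_message=1.0):
    """Filter messages in order until the simulated timeout would be passed.

    Returns (mail_store, spam_store, processed_count).
    """
    mail_store, spam_store = [], []
    elapsed = 0.0
    processed = 0
    for message in messages:
        if elapsed + cost_per_message > timeout:
            break                     # the client needs its answer now
        elapsed += cost_per_message
        (spam_store if is_spam(message) else mail_store).append(message)
        processed += 1
    return mail_store, spam_store, processed


def client_listing(mail_store):
    """Number the legitimate messages consecutively, as the client sees them."""
    return {n: message for n, message in enumerate(mail_store, start=1)}
```

With the first four messages being legitimate, spam, spam and legitimate, and a timeout that allows four messages to be processed, the listing presented to the client contains two entries: original message 1 as message 1 and original message 4 as message 2.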

Following timeout period 624, however, proxy server 300 continues retrieving and processing the 96 e-mail messages remaining on mail server 130 as described above. At some point after proxy server 300 has completed processing the remaining 96 messages, user agent 115 on client 110 sends another LIST command to proxy server 300, thereby requesting a list of current e-mail messages (signal 660). At this point, proxy server 300 has processed the 96 additional e-mail messages following timeout period 624. For the purposes of the following discussion it will be assumed that half of the 96 messages processed after timeout period 624 were identified as spam by filter module 310. Thus, by the time client 110 sends LIST signal 660, filter module 310 has transferred 48 messages into spam storage module 340 and 48 messages into mail storage module 330.

Upon receiving the LIST signal 660 from client 110, proxy server 300 sends a second LIST signal 665 to mail server 130 to determine whether additional e-mail messages addressed to client 110 have been received at mailbox 140 since proxy server 300 and mail server 130 last communicated. In the example illustrated in FIG. 6, mail server 130 sends signal 670 indicating that one additional message has been received during this period. Accordingly, proxy server 300 retrieves the additional e-mail message (signals 675 and 680) and determines that it is legitimate by passing it through filter module 310. Proxy server 300 then terminates communication with mail server 130 by issuing the QUIT command (signal 685).

Thus, in response to client 110's LIST command 660, proxy server 300 sends a response signal 690 indicating that 49 e-mail messages are currently available, all of which are legitimate e-mail messages. Client 110 will subsequently proceed to retrieve each of the 49 remaining messages from mail storage module 330 on proxy server 300 (although these signals are not illustrated in the signal chart of FIG. 6). The end result is that client 110 only downloads legitimate e-mail and thereby conserves time, access cost, and hard drive space. In addition, client 110 is able to implement the current system and method without modifying his ISP account, online service provider account or corporate mail server account (all of which are represented by mail server 130). In other words, because proxy server 300 sends the user name and password of client 110 to mail server 130, mail server 130 assumes that it is client 110 connecting to retrieve mail. As such, no modification is required to client 110's account on mail server 130.
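The count of 49 follows from the example's assumptions: 48 of the 96 backlog messages are legitimate, plus the one newly arrived legitimate message. The short sketch below reproduces that arithmetic. The function name and the even-numbered spam rule are assumptions chosen only so that exactly half of messages 5 through 100 count as spam; they are not taken from the disclosure.

```python
from typing import Callable, Iterable, List


def pending_legitimate(backlog: Iterable[int],
                       new_arrivals: Iterable[int],
                       is_spam: Callable[[int], bool]) -> List[int]:
    """Return the server message numbers awaiting download at the second LIST."""
    candidates = list(backlog) + list(new_arrivals)
    return [number for number in candidates if not is_spam(number)]


def assumed_spam(number: int) -> bool:
    # Illustrative assumption: even-numbered messages are treated as spam,
    # which marks half of messages 5..100 as spam.
    return number % 2 == 0


# Messages 5..100 were processed after the timeout; message 101 arrived later.
pending = pending_legitimate(range(5, 101), [101], assumed_spam)
print(len(pending))  # 49
```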

In one embodiment, proxy server 300 will store cleaned e-mail messages in mail storage module 330 even after the messages have been downloaded by client 110. This is done so that the messages do not have to be filtered a second time if client 110 needs to access the messages again (e.g., if the user of client 110 attempts to check e-mail from a different computer).

Referring now to FIG. 8, in one embodiment, two or more user accounts may be established on server 130 (FIG. 2) or proxy server 300 (FIG. 3) such that one user reviews the filtered e-mail of another user. Thus, user 810 is assigned a personal mail storage area 835 which contains e-mail messages which have passed through filter module 850 (and have been identified as legitimate). User 810 also has access to a personal spam storage area 840 which contains e-mail messages identified as spam by filter module 850. In addition, user 810 can view messages stored in a spam storage area 828 assigned to user 820 and select (via control unit 830) messages to pass through to user 820's mail storage 825. Thus, in this embodiment user 810 may be a parent who sets filter module 850 to aggressively filter messages (i.e., to trigger at a low threshold as described above) addressed to his/her child (user 820). User 810 can then review the messages before allowing the child (user 820) to access the messages. Alternatively, user 810 may be a corporate network administrator reviewing filtered messages for an employee 820 to determine whether the filtered messages have been filtered correctly.
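The review arrangement of FIG. 8 can be sketched as a permission check followed by a move from one store to the other. This is an illustrative sketch only: the `Account` class, the `reviewers` field, and the `release` function are hypothetical names introduced here, and the disclosure does not specify how control unit 830 is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Account:
    """Hypothetical per-user storage: a mail area and a spam area."""
    name: str
    mail: Dict[str, str] = field(default_factory=dict)   # message id -> body
    spam: Dict[str, str] = field(default_factory=dict)   # message id -> body
    reviewers: Set[str] = field(default_factory=set)     # users allowed to review


def release(reviewer: str, owner: Account, message_id: str) -> None:
    """Move a filtered message from the owner's spam area to the mail area."""
    if reviewer != owner.name and reviewer not in owner.reviewers:
        raise PermissionError(f"{reviewer} may not review mail for {owner.name}")
    # pop() raises KeyError if the message is not in the spam area.
    owner.mail[message_id] = owner.spam.pop(message_id)


# A parent (user 810) reviews messages filtered for a child (user 820).
child = Account(name="user820", reviewers={"user810"})
child.spam["m1"] = "Field trip permission slip"
release("user810", child, "m1")
print(child.mail)  # {'m1': 'Field trip permission slip'}
```

Keeping the reviewer list on the owner's account, rather than on the reviewer's, is one simple design choice; it lets each filtered user have a different set of reviewers.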

In another embodiment, once a particular e-mail message has been identified as spam, only a single copy of that message is stored in spam storage module 340, regardless of how many different e-mail users the message is addressed to. This technique of caching only one copy of a spam message in spam storage module 340 allows proxy server 300 to conserve significant space in spam storage area 340 given the fact that spam is commonly distributed to thousands of e-mail users at a time.
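One way to keep a single copy of a spam message shared by many recipients is to key the stored body by a digest of its content. The sketch below assumes that two messages are "the same" when their bodies are byte-for-byte identical and uses SHA-256 from Python's standard `hashlib` module; the `SpamStore` class and its methods are hypothetical, and the disclosure does not state how duplicate messages are detected.

```python
import hashlib
from typing import Dict, List, Set


class SpamStore:
    """Hypothetical single-copy spam store keyed by a content digest."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}         # digest -> body, stored once
        self._recipients: Dict[str, Set[str]] = {}  # digest -> addressed users

    def add(self, recipient: str, body: bytes) -> str:
        digest = hashlib.sha256(body).hexdigest()
        # Store the body only the first time this digest is seen.
        self._bodies.setdefault(digest, body)
        self._recipients.setdefault(digest, set()).add(recipient)
        return digest

    def stored_copies(self) -> int:
        return len(self._bodies)

    def messages_for(self, recipient: str) -> List[bytes]:
        return [self._bodies[digest]
                for digest, users in self._recipients.items()
                if recipient in users]


store = SpamStore()
for user in ["alice", "bob", "carol"]:
    store.add(user, b"Buy now")
print(store.stored_copies())  # 1
```

An exact-match digest would miss spam that is personalized per recipient; handling such variants would need a different signature scheme, which is outside this sketch.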

One of ordinary skill in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention. Throughout this detailed description, numerous specific details are set forth such as specific mail protocols (e.g., POP3) and filter applications (e.g., spam removal) in order to provide a thorough understanding of the present invention. It will be appreciated by one having ordinary skill in the art, however, that the present invention may be practiced without such specific details. In other instances, well known software-implemented communication techniques have not been described in detail in order to avoid obscuring the subject matter of the present invention. The invention should, therefore, be measured in terms of the claims which follow.

1. A server having a processor and a memory coupled to the processor, the memory having stored therein sequences of instructions which, when executed by the processor, cause the processor to perform the steps of: creating at least one fictitious email probe email address selected to appear on spam email mailing lists and to receive sample spam email messages; receiving a request to retrieve e-mail messages on behalf of a client; retrieving one or more e-mail messages from a mail server on behalf of the client; filtering the e-mail messages based on one or more rules to produce one or more filtered e-mail messages, the rules being dynamically established utilizing sample email messages retrieved from one or more probes and aged based on frequency of use; and transferring one or more of the filtered e-mail messages to the client, while storing the e-mail messages not transferred to the client in a memory on the server.
 2. The server as claimed in claim 1 including the initial step of receiving a user name and a password from the client and transmitting the user name and password to the mail server on behalf of the client.
 3. The server as claimed in claim 1 wherein one or more of the filtered e-mail messages are transferred to the client before all of the e-mail messages stored on the mail server have been retrieved from the mail server.
 4. The server as claimed in claim 1 wherein one or more of the filtered e-mail messages are transferred to the client before all of the e-mail messages have been filtered.
 5. The server as claimed in claim 3 wherein the server continues to retrieve one or more e-mail messages from the mail server after the client has disconnected from the server.
 6. The server as claimed in claim 4 wherein the server continues to filter one or more of the e-mail messages after the client has disconnected from the server.
 7. The server as claimed in claim 1 wherein the e-mail messages are filtered based on information in the e-mail message header.
 8. The server as claimed in claim 1 wherein the e-mail messages are filtered based on the address from which the e-mail messages originate.
 9. The server as claimed in claim 1 wherein the e-mail messages are filtered based on keywords within the e-mail messages.
 10. The server as claimed in claim 1 wherein the e-mail messages are filtered based on a mathematical signature of the e-mail messages.
 11. The server as claimed in claim 1 wherein the e-mail messages are filtered based on whether the e-mail messages contain computer viruses.
 12. The server as claimed in claim 1 wherein the e-mail messages are filtered based on an inclusion list.
 13. The server as claimed in claim 1 wherein the filtering step is performed by a filter module comprised of an application programming interface (“API”) and one or more dynamically-linked rule modules.
 14. The server as claimed in claim 1 wherein the server monitors how frequently each of the rules for filtering the e-mail messages is utilized.
 15. The server as claimed in claim 14 wherein the rules for filtering the e-mail messages become inactive if not utilized for a predetermined period of time.
 16. The server as claimed in claim 1 wherein the server performs the step of retrieving one or more e-mail messages from the mail server using the Post Office Protocol 3 (“POP3”).
 17. The server as claimed in claim 1 wherein the server performs the step of retrieving one or more e-mail messages from the mail server using Interactive Mail Access Protocol (“IMAP”).
 18. The server as claimed in claim 1 wherein the server performs the step of retrieving one or more e-mail messages from the mail server using Distributed Mail System Protocol (“DMSP”).
 19. The server as claimed in claim 1 wherein the server communicates with the client using a different mail protocol than it uses to communicate with the mail server.
 20. A first server having a processor and a memory coupled to the processor, the memory having stored therein sequences of instructions which, when executed by the processor, cause the processor to perform the steps of: creating at least one fictitious email probe email address selected to appear on spam email mailing lists and to receive sample spam email messages; retrieving messages from a second server on behalf of a client; sorting messages into two or more groups based on one or more rules, the rules being dynamically established utilizing sample email messages retrieved from one or more probes and aged based on frequency of use; and forwarding messages sorted into one of the groups to the client, while storing the messages not forwarded to the client in a memory on the first server.
 21. The first server as claimed in claim 20 wherein the stored messages are deleted after a predetermined period of time.
 22. The first server as claimed in claim 20 including the initial step of receiving a user name and a password from the client and transmitting the user name and password to the second server on behalf of the client.
 23. The first server as claimed in claim 20 wherein messages sorted into one of the groups are transferred to the client before all of the messages stored on the second server have been retrieved from the second server.
 24. The first server as claimed in claim 20 wherein messages sorted into one of the groups are transferred to the client before all of the messages have been sorted.
 25. The first server as claimed in claim 20 wherein the first server continues to retrieve one or more messages from the second server after the client has disconnected from the first server.
 26. The first server as claimed in claim 20 wherein the first server continues to sort messages after the client has disconnected from the first server.
 27. The first server as claimed in claim 20 wherein the rules are weighted.
 28. The first server as claimed in claim 27 wherein the rules are weighted based on the likelihood that the means for filtering has accurately identified the e-mail message.
 29. The first server as claimed in claim 28 wherein if more than one rule identifies the e-mail message as spam, the weights of the rules identifying the e-mail message as spam are added together to produce a cumulative weighted value.
 30. The first server as claimed in claim 29 wherein the filter is set to filter e-mail messages based on a predetermined cumulative weighted value.
 31. The first server as claimed in claim 20 wherein the messages stored in the memory of the first server are reviewed by a message recipient to determine whether the messages have been grouped correctly.
 32. The first server as claimed in claim 20 wherein the messages stored in the memory of the first server are reviewed by a third party to determine whether the messages have been grouped correctly.
 33. An e-mail filter comprising: an application programming interface (“API”); a plurality of rule handling filter modules adapted to interface with the API, the plurality of rule handling filter modules adapted to filter e-mail messages based on one or more rules to produce one or more filtered e-mail messages, the filtered e-mail messages transferred to a client, the rules being dynamically established utilizing sample email messages retrieved from one or more probes and aged based on frequency of use, wherein one or more fictitious probes are created to appear on spam email mailing lists and to receive sample spam email messages; and a storage module on a server to store e-mail messages not transferred to the client.
 34. The e-mail filter as claimed in claim 33 wherein one of the plurality of rule handling filter modules filters e-mail messages based on a mathematical signature of the e-mail messages.
 35. The e-mail filter as claimed in claim 34 wherein the mathematical signature is a checksum.
 36. The e-mail filter as claimed in claim 33 wherein one of the plurality of rule handling filter modules filters e-mail messages based on one or more keywords within the e-mail messages.
 37. The e-mail filter as claimed in claim 33 wherein one of the plurality of rule handling filter modules filters e-mail messages based on information within the e-mail message headers.
 38. The e-mail filter as claimed in claim 33 wherein one of the plurality of rule handling filter modules filters e-mail messages based on the network address from which the e-mail messages originate.
 39. The e-mail filter as claimed in claim 33 wherein the rule handling filter modules are weighted based on the likelihood that they will accurately filter the e-mail messages.
 40. The e-mail filter as claimed in claim 39 wherein if more than one rule handling filter module identifies the email message as spam, the weights of the rule handling filter modules identifying the e-mail message as spam are added together to produce a cumulative weighted value.
 41. The e-mail filter as claimed in claim 39 wherein the email filter is configured to filter e-mail messages based on a predetermined cumulative weighted value.
 42. A server having a processor and a memory coupled to the processor, the memory having stored therein sequences of instructions which, when executed by the processor, cause the processor to perform the steps of: creating at least one fictitious email probe email address selected to appear on spam email mailing lists and to receive sample spam email messages; receiving a request to retrieve e-mail messages on behalf of a client; retrieving one or more e-mail messages from a mail server on behalf of the client; filtering the e-mail messages based on one or more rules to produce one or more filtered e-mail messages, the rules being dynamically established utilizing sample email messages retrieved from one or more probes and aged based on frequency of use; and transferring one or more of the filtered e-mail messages to the client, while storing the e-mail messages not transferred to the client in a memory on the server.
 43. The server as claimed in claim 42 including the initial step of receiving a user name and a password from the client and transmitting the user name and password to the mail server on behalf of the client.
 44. The server as claimed in claim 42 wherein one or more of the filtered e-mail messages are transferred to the client before all of the e-mail messages stored on the mail server have been retrieved from the mail server.
 45. The server as claimed in claim 42 wherein one or more of the filtered e-mail messages are transferred to the client before all of the e-mail messages have been filtered.
 46. The server as claimed in claim 45 wherein the server continues to retrieve one or more e-mail messages from the mail server after the client has disconnected from the server.
 47. The server as claimed in claim 46 wherein the server continues to filter one or more of the e-mail messages after the client has disconnected from the server.
 48. The server as claimed in claim 42 wherein the e-mail messages are filtered based on information in the e-mail message header.
 49. The server as claimed in claim 42 wherein the e-mail messages are filtered based on the address from which the e-mail messages originate.
 50. The server as claimed in claim 42 wherein the e-mail messages are filtered based on keywords within the e-mail messages.
 51. The server as claimed in claim 42 wherein the e-mail messages are filtered based on a mathematical signature of the e-mail messages.
 52. The server as claimed in claim 42 wherein the e-mail messages are filtered based on whether the e-mail messages contain computer viruses.
 53. The server as claimed in claim 42 wherein the e-mail messages are filtered based on an inclusion list.
 54. The server as claimed in claim 42 wherein the filtering step is performed by a filter module comprised of an application programming interface (“API”) and one or more dynamically linked rule modules.
 55. The server as claimed in claim 42 wherein the server monitors how frequently each of the rules for filtering the e-mail messages is utilized.
 56. The server as claimed in claim 55 wherein the rules for filtering the e-mail messages become inactive if not utilized for a predetermined period of time.
 57. The server as claimed in claim 42 wherein the server performs the step of retrieving one or more e-mail messages from the mail server using the Post Office Protocol 3 (“POP3”).
 58. The server as claimed in claim 42 wherein the server performs the step of retrieving one or more e-mail messages from the mail server using Interactive Mail Access Protocol (“IMAP”).
 59. The server as claimed in claim 42 wherein the server performs the step of retrieving one or more e-mail messages from the mail server using Distributed Mail System Protocol (“DMSP”).
 60. The server as claimed in claim 42 wherein the server communicates with the client using a different mail protocol than it uses to communicate with the mail server.